AITopics | Maracaibo

A Psychology-based Unified Dynamic Framework for Curriculum Learning

Meng, Guangyu, Zeng, Qingkai, Lalor, John P., Yu, Hong

arXiv.org Artificial Intelligence, Aug-9-2024

Directly learning from examples of random difficulty levels is often challenging for both humans and machine learning models. A more effective strategy involves exposing learners to examples in a progressive order, from easy to difficult. Curriculum Learning (CL) has been proposed to implement this strategy in machine learning model training. However, two key challenges persist in CL framework design: defining the difficulty of training data and determining the appropriate amount of data to input at each training step. This paper presents a Psychology-based Unified Dynamic Framework for Curriculum Learning (PUDF), drawing inspiration from psychometrics. We quantify the difficulty of training data by applying Item Response Theory (IRT) to responses from Artificial Crowds (AC). This theory-driven IRT-AC approach leads to global (i.e., model-independent) and interpretable difficulty values. Leveraging IRT, we propose a Dynamic Data Selection via Model Ability Estimation (DDS-MAE) strategy to schedule the appropriate amount of data during model training. Since our difficulty labeling and model ability estimation are based on a consistent theory, namely IRT, their values are comparable within the same scope, potentially leading to a faster convergence compared to the other CL methods. Experimental results demonstrate that fine-tuning pre-trained language models with PUDF enhances their performance on the GLUE benchmark. Moreover, PUDF surpasses other state-of-the-art (SOTA) CL methods on the GLUE benchmark. We further explore the components of PUDF, namely the difficulty measurer (IRT-AC) and the training scheduler (DDS-MAE) qualitatively and quantitatively. Lastly, we conduct an ablation study to clarify which components of PUDF contribute to faster convergence and higher accuracy.
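As a rough illustration of the two ideas in the abstract above (IRT-based difficulty and ability-driven data selection), the following is a minimal sketch, not the authors' implementation. It assumes the simplest IRT variant, the one-parameter (Rasch) model, assumes item difficulties have already been estimated, and assumes that "scheduling" means keeping examples whose difficulty does not exceed the current ability estimate. The function names `p_correct`, `estimate_ability`, and `select_examples` are hypothetical.

```python
import math


def p_correct(theta, b):
    """Rasch (1PL) probability that a learner of ability theta solves an item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def estimate_ability(difficulties, responses, steps=50):
    """Maximum-likelihood ability estimate via Newton's method.

    responses[i] is 1 if the model answered item i correctly, else 0.
    The estimate is clamped to [-6, 6] so all-correct or all-wrong
    response patterns do not diverge.
    """
    theta = 0.0
    for _ in range(steps):
        probs = [p_correct(theta, b) for b in difficulties]
        grad = sum(r - p for r, p in zip(responses, probs))  # d logL / d theta
        hess = -sum(p * (1.0 - p) for p in probs)            # d^2 logL / d theta^2
        if abs(hess) < 1e-12:
            break
        theta -= grad / hess
        theta = max(-6.0, min(6.0, theta))
    return theta


def select_examples(difficulties, theta):
    """Assumed selection rule: keep indices of items no harder than the current ability."""
    return [i for i, b in enumerate(difficulties) if b <= theta]


# Toy data (made up for illustration): five items, the model solves the three easiest.
difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]
responses = [1, 1, 1, 0, 0]
theta = estimate_ability(difficulties, responses)
selected = select_examples(difficulties, theta)
```

Because difficulty and ability live on the same scale in this model, the comparison `b <= theta` is meaningful, which is the property the abstract attributes to using one consistent theory for both.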

curriculum, psychology-based unified dynamic framework, pudf, (13 more...)

arXiv.org Artificial Intelligence

2408.05326

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Middle East > Jordan (0.04)
Europe > Holy See > Vatican City (0.04)
(12 more...)

Genre:

Workflow (1.00)
Research Report > New Finding (1.00)

Industry:

Education (1.00)
Energy (0.93)
Media > Music (0.67)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.66)

VERISCORE: Evaluating the factuality of verifiable claims in long-form text generation

Song, Yixiao, Kim, Yekyung, Iyyer, Mohit

arXiv.org Artificial Intelligence, Jun-27-2024

Existing metrics for evaluating the factuality of long-form text, such as FACTSCORE (Min et al., 2023) and SAFE (Wei et al., 2024), decompose an input text into "atomic claims" and verify each against a knowledge base like Wikipedia. These metrics are not suitable for most generation tasks because they assume that every claim is verifiable (i.e., can plausibly be proven true or false). We address this issue with VERISCORE, a metric for diverse long-form generation tasks that contain both verifiable and unverifiable content. VERISCORE can be effectively implemented with either closed or fine-tuned open-weight language models, and human evaluation confirms that VERISCORE's extracted claims are more sensible than those from competing methods across eight different long-form tasks. We use VERISCORE to evaluate generations from 16 different models across multiple long-form tasks and find that while GPT-4o is the best-performing model overall, open-weight models such as Mixtral-8x22 are closing the gap. We show that an LM's VERISCORE on one task (e.g., biography generation) does not necessarily correlate to its VERISCORE on a different task (e.g., long-form QA), highlighting the need for expanding factuality evaluation across tasks with varying fact density.

annotator, search result, verifiable claim, (14 more...)

arXiv.org Artificial Intelligence

2406.19276

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Mexico > Mexico City > Mexico City (0.04)
Europe > Russia (0.04)
(17 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.93)
Energy (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

RobuT: A Systematic Study of Table QA Robustness Against Human-Annotated Adversarial Perturbations

Zhao, Yilun, Zhao, Chen, Nan, Linyong, Qi, Zhenting, Zhang, Wenlin, Tang, Xiangru, Mi, Boyu, Radev, Dragomir

arXiv.org Artificial Intelligence, Jun-25-2023

Despite significant progress having been made in question answering on tabular data (Table QA), it's unclear whether, and to what extent, existing Table QA models are robust to task-specific perturbations, e.g., replacing key question entities or shuffling table columns. To systematically study the robustness of Table QA models, we propose a benchmark called RobuT, which builds upon existing Table QA datasets (WTQ, WikiSQL-Weak, and SQA) and includes human-annotated adversarial perturbations in terms of table header, table content, and question. Our results indicate that both state-of-the-art Table QA models and large language models (e.g., GPT-3) with few-shot learning falter in these adversarial sets. We propose to address this problem by using large language models to generate adversarial examples to enhance training, which significantly improves the robustness of Table QA models. Our data and code are publicly available at https://github.com/yilunzhao/RobuT.
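To make one of the perturbations named in the abstract concrete, here is a minimal sketch of column shuffling. It is not taken from the RobuT code base: the table representation (a header list plus row lists) and the function name `shuffle_columns` are assumptions for illustration, and RobuT's own perturbations are human-annotated rather than generated this way.

```python
import random


def shuffle_columns(header, rows, seed=0):
    """Return a copy of the table with columns permuted.

    The same permutation is applied to the header and to every row,
    so each cell stays attached to its column name and the answer to
    any question about the table is unchanged.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    order = list(range(len(header)))
    rng.shuffle(order)
    new_header = [header[i] for i in order]
    new_rows = [[row[i] for i in order] for row in rows]
    return new_header, new_rows


# Toy table (made up for illustration).
header = ["player", "team", "goals"]
rows = [["Ana", "Reds", 12], ["Luis", "Blues", 9]]
new_header, new_rows = shuffle_columns(header, rows, seed=1)

# View each row as a {column name: cell} record to compare content regardless of order.
records_before = [dict(zip(header, r)) for r in rows]
records_after = [dict(zip(new_header, r)) for r in new_rows]
```

A model that is robust to this perturbation should give the same answer for the original and shuffled tables, since `records_before` and `records_after` carry identical content.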

computational linguistic, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2306.14321

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(25 more...)

Genre: Research Report > New Finding (0.88)

Industry: Leisure & Entertainment > Sports (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Regionalized models for Spanish language variations based on Twitter

Tellez, Eric S., Moctezuma, Daniela, Miranda, Sabino, Graff, Mario, Ruiz, Guillermo

arXiv.org Artificial Intelligence, Dec-9-2022

Spanish is one of the most widely spoken languages in the world, but it is not necessarily written and spoken the same way in different countries. Understanding local language variations can help improve model performance on regional tasks, both in understanding local structures and in improving the message's content. For instance, consider a machine learning engineer who automates a language classification task for a particular region, or a social scientist trying to understand a regional event with echoes on social media; both can take advantage of dialect-based language models to understand what is happening with more contextual information and hence more precision. This manuscript presents and describes a set of regionalized resources for the Spanish language built on four years of public Twitter messages geotagged in 26 Spanish-speaking countries. We introduce word embeddings based on FastText, language models based on BERT, and per-region sample corpora. We also provide a broad comparison among regions covering lexical and semantic similarities, as well as examples of using regional resources on message classification tasks.

machine learning, natural language, springer nature 2021, (20 more...)

arXiv.org Artificial Intelligence

2110.06128

Country:

North America > United States (0.14)
South America > Argentina (0.05)
North America > Cuba (0.04)
(35 more...)

Genre: Research Report (1.00)

Industry:

Information Technology > Services (0.93)
Health & Medicine (0.68)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

The Venezuelans Trying to Escape Their Country Through Video Game Grunt Work

Slate, Aug-25-2021, 13:00:00 GMT

On a recent afternoon in Maracaibo, Venezuela, Alexander Marinez, who has short-cropped black hair and three-to-four-day stubble, sat in front of his computer tracking herbiboars in the mushroom forests on Fossil Island. He pressed down on his glowing mouse, the newest addition to his otherwise timeworn gaming setup. The pixelated character on his computer screen followed the tracks of a hedgehoglike creature with triangular tusks and herbs growing out of its back. Outside Marinez's one-story house, the sun bore down on the dirt road. His home lies about six miles away from the strait that connects the Caribbean Sea with Lake Maracaibo, one of the world's richest sources of oil. The character inspected a tunnel. Suddenly, the herbiboar appeared, and the character attacked, stunning it.

marinez, runescape, venezuela, (13 more...)

Slate

Country:

South America > Venezuela > Zulia State > Maracaibo (0.46)
Atlantic Ocean > Caribbean Sea (0.25)
South America > Venezuela > Lake Maracaibo (0.24)
(13 more...)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Government (1.00)
Banking & Finance (1.00)

Technology:

Information Technology > Communications (0.95)
Information Technology > Artificial Intelligence > Games (0.51)

This is Artificial Intelligence's dirty little secret | Gadgets Now

#artificialintelligence, Mar-11-2018, 14:39:08 GMT

SAN FRANCISCO: There's a dirty little secret about artificial intelligence: It's powered by hundreds of thousands of real people. From makeup artists in Venezuela to women in conservative parts of India, people around the world are doing the digital equivalent of needlework: drawing boxes around cars in street photos, tagging images, and transcribing snatches of speech that computers can't quite make out. Such data feeds directly into "machine learning" algorithms that help self-driving cars wind through traffic and let Alexa figure out that you want the lights on. These repetitive tasks pay pennies apiece. But in bulk, this work can offer a decent wage in many parts of the world, even in the U.S.

algorithm, artificial intelligence, mighty ai, (13 more...)

#artificialintelligence

Country:

Asia > India (0.27)
North America > United States > California > San Francisco County > San Francisco (0.25)
South America > Venezuela > Zulia State > Maracaibo (0.05)
(5 more...)

Industry:

Information Technology (1.00)
Transportation > Passenger (0.51)
Transportation > Ground > Road (0.51)
Consumer Products & Services > Hotels (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.59)
